Tandem repeat determination by concurrent analysis of multiple tandem duplex configurations

ABSTRACT

Disclosed is a method of analyzing tandem repeats using one or more probes, each of which may lack an anchoring sequence but contains one or more tandem repeat sequences complementary to the target tandem repeat sequences. In one embodiment, each probe is attached, via its 5′ end, to an encoded microparticle (“bead”), wherein the code, implemented by way of a color scheme, identifies the sequence and length of the probe attached thereto. Also disclosed are methods relating to the analysis of partial duplex configurations involving only partial overlap between probe and target repeats and thus “overhangs” of probe repeats on the 3′ and/or 5′ ends of the target repeats.

RELATED APPLICATIONS

This application claims priority to U.S. Provisional Application Ser. No. 60/492,859, filed Aug. 6, 2003.

BACKGROUND

The analysis of polymorphisms in the number of repeated DNA sequence elements (“repeats”) in certain designated genetic loci has a variety of applications in molecular diagnostics and biomedical research. These applications include the molecular determination of identity for parentage and forensic analysis, the diagnosis of genetic diseases including Huntington's disease and fragile X syndrome caused by the expansion of trinucleotide repeats, the analysis of genetic markers relating to gene regulation, as in the case of dinucleotide repeat polymorphisms in the transcription region of several cytokines, as well as mapping and linkage analysis.

Variable number tandem repeats (“VNTRs”) represent one type of repeat length polymorphism (See Nakamura Y., et al. (1987), Science 235:1616–1622; U.S. Pat. Nos. 4,963,663; 5,411,859) resulting from the insertion, in tandem, of multiple copies of identical segments of DNA, known as minisatellites, typically 10 bp to 100 bp in length. VNTR markers are highly polymorphic, in fact more so than base substitution polymorphisms, sometimes displaying up to forty or more alleles at a single genetic locus. The determination of the number of repeats in VNTRs would provide a means for identity typing but for the fact that there are few fast and accurate methods for this purpose. The commonly used method involves enzymatic digestion of the VNTR-containing nucleic acid segment, followed by Southern blotting, a labor-intensive and time-consuming procedure. Alternative methods invoking the polymerase chain reaction (PCR) (U.S. Pat. No. 4,683,202) are of limited utility in the analysis of VNTRs because of PCR's shortcomings in reliably amplifying segments exceeding 3,000 bases in length. Only a few amplifiable VNTRs have been developed, making them, as a class, impractical for linkage mapping and identity typing.

More frequent, and more polymorphic, than VNTRs are microsatellite loci, consisting of repeating units typically comprising only a few bases. Short tandem repeats (“STRs”) represent an example of “microsatellite” markers. As with amplifiable VNTRs, alleles of microsatellite loci differ in length, but in contrast to VNTRs contain, in the case of STRs, only two to seven perfect or imperfect repeated sequence elements displaying two, three, four or rarely five bases. Available amplification protocols produce small products, generally from 60 to 400 base pairs in length, permitting the determination of STR repeat numbers by means of amplification followed by gel analysis. Care must be taken in selecting PCR primers in order to eliminate amplification products containing alleles of more than one locus.

Hybridization, widely used for the analysis of polymorphisms generally, also has been used for the analysis of variations in the number of repeats. Hybridization-mediated analysis of point mutations (i.e., base substitutions), as well as deletions and insertions, typically involves a pair of allele specific oligonucleotides (ASOs) of which one is designed to be complementary to the normal (“wild type”), and the other is designed to be complementary to the variant (“mutant”) sequence. However, in “multiplexed” configurations, calling for the concurrent analysis of multiple polymorphisms, cross-hybridization often limits the reliability of the analysis.

One hybridization-mediated method of analyzing variations in the number of repeats requires a large number of allele-specific oligonucleotides, each such ASO matching in length one of the alleles. For example, U.S. Pat. No. 6,307,039 discloses a method wherein ASOs are provided so as to permit template-mediated probe extension if, and only if, the number of repeats in the probe sequence is equal to or less than the number of repeats in the target sequence. Using a set of such ASO probes, the set containing at least one probe for each anticipated configuration of target repeats, the number of target repeats can be determined by monitoring the outcome of the extension reaction for all probes so as to identify that probe in the set whose repeat count matches that of the target.

This approach has several disadvantages which seriously impair its practical utility. First, in order to eliminate errors in the measurement of repeat length due to “slippage”, that is, shifts in probe-target alignment, probes must contain an “anchoring” sequence of sufficient length to ensure predictable alignment with the flanking sequence located upstream from the target repeat. Second, target and probes of increasing lengths form duplexes which contain increasing numbers of repeats and thus display widely varying thermodynamic stabilities, a feature which renders an isothermal assay protocol impractical and instead requires careful real-time temperature control. Third, a probe must be provided for each possible target polymorphism, a requirement that implies large probe sets and complex assay protocols and hence considerable cost. For example, a typical application such as the implementation of a 13-marker STR panel commonly used for forensic analysis will require on the order of 100 probes. In cases such as Huntington's disease, characterized by up to forty or more triplet repeats, each probe in the large set of requisite probes must be precisely aligned with the target by way of a long anchoring sequence, implying an assay design of considerable complexity. Finally, this approach does not accommodate cases involving polymorphisms with an unknown range of repeats such as the FGA marker commonly employed for parentage analysis.

A method of concurrent analysis of multiple tandem repeats, invoking an array of a minimal number of probes, regardless of the possible number, known or unknown, of target repeats, while simplifying the assay design, for example by eliminating the requirement for an anchoring sequence, clearly would be desirable.

SUMMARY

Described is a method of analyzing tandem repeats using one or more probes, each of which may lack an anchoring sequence but contains one or more tandem repeat sequences complementary to the target tandem repeat sequences. In one embodiment, each probe is attached, via its 5′ end, to an encoded microparticle (“bead”), wherein the code, implemented by way of a color scheme as shown in FIG. 1, identifies the sequence and length of the probe attached thereto.

If the number of repeat sequences in the probe is p, and the number of repeat sequences in the target is t, then, provided that p is less than t (p<t), probe and target can hybridize, with equal likelihood, in any of t−p+1 possible (degenerate) configurations differing only in the phase of alignment. These configurations, also referred to herein as full-length duplex repeat configurations or full-length duplex configurations, involve a full overlap of probe repeats with target repeats (t>p) or target repeats with probe repeats (t≦p). For example, the possible full-duplex configurations formed between a probe with three repeats and a target with six repeats are shown in FIG. 2: one unique “terminal alignment” configuration, wherein the probe is aligned with the “5′-terminal” nucleotide in the target repeat, and three “internal alignment” configurations, wherein the probe is aligned with “repeat-internal” nucleotides in the target repeat. Differential labeling of the terminally aligned duplex states and the internally aligned duplex states, for example by way of single base extension using differentially labeled ddNTPs as disclosed herein, permits the “counting” of target repeats using one or more probes containing a known number of probe repeats.

Successive determinations of target repeat numbers may be made by placing probes in solution so as to permit interaction with one target. Preferably, two or more probes designed for the analysis of one target sequence, these probes having different probe repeat numbers, p₁<t and p₂<t, p₁≠p₂, or a set of such two or more probes for the analysis of multiple target sequences, are used in a parallel assay format of analysis.

Preferably, the temperature of the assay may be set so that, for each probe containing p (<t) probe repeats, only full-length duplexes, i.e., those containing p duplex repeats, are stable, but duplexes containing fewer than p duplex repeats are not. Alternatively, the assay temperature can be set to ensure stability of a duplex containing at least p−k, 1≦k<p, duplex repeats. Using standard methods of temperature control, the assay also may be performed at several operating temperatures to monitor the evolution of partial and full duplex states of differing thermodynamic stability. In other instances, the temperature may be set to a value exceeding the nominal “melting” temperatures of some or all duplexes, for example when those “melting” temperatures are low compared to the preferred temperature of operation of the polymerase mediating the extension reaction for the labeling of internally and terminally aligned states. This condition generally will favor the formation of partially or completely denatured probe-target duplex states. Also disclosed are weight functions to model these situations.

Also described are methods relating to the analysis of partial duplex configurations involving only partial overlap between probe and target repeats and thus “overhangs” of probe repeats on the 3′ and/or 5′ ends of the target repeats. The formation of terminally aligned duplex states and internally aligned duplex states requiring the formation of such tails (or loops) will be governed by the probability of formation of each such configuration, which thereby directly affects the assay signal. In order to provide methods of quantitative analysis of experimental data obtained, also disclosed are several models of assigning probabilities to configurations differing in the number of duplex repeats.

BRIEF DESCRIPTION OF THE FIGURES

FIG. 1 depicts three different probes of different repeat lengths, encoded by attachment with different microparticles, and a target with six repeats.

FIG. 2 depicts a probe with three repeats (attached to an encoded microparticle) in all possible full-duplex configurations formed between the probe and a target with six repeats, and wherein the probe (following elongation) generates a different signal when bound in the terminal position of the target than in other positions.

FIG. 3A is a plot of intensities resulting from hybridization-mediated elongation using different length probes (each of which when bound in the terminal position on the target generates a different signal than when bound in other positions) and a number of different length targets.

FIG. 3B is a plot of the intensity ratio resulting from hybridization-mediated elongation using different length probes (each of which when bound in the terminal position on the target generates a different signal than when bound in other positions, as in FIG. 3A) plotted against probe repeat number.

FIG. 4 depicts an “offset” encoded probe in all possible full-duplex configurations formed between the probe and a target with six repeats, and wherein the probe (following elongation) generates a different signal when bound in the terminal position of the target than in other positions.

FIG. 5 depicts an encoded probe with six repeat units in all possible full-duplex configurations formed between the probe and a target with four repeats, and wherein the probe (following elongation) generates a signal only when it is bound such that its terminal position is aligned with the terminal position of the target.

FIG. 6 depicts an adapter fragment that has at least two portions, one of which is complementary to a flanking region adjacent to the 5′ end of the target repeat and one of which is complementary to a probe repeat unit, and which generates a signal (following extension) when the probe is in the “one repeat overhang” position shown in the lower position in FIG. 6.

FIG. 7 depicts a target which may be enzymatically modified to permit target extension (and unique labeling) using the probe as a template.

FIG. 8 depicts partial duplex configurations in which the probe has more repeat units than the target (wherein the probe, following elongation, generates a different signal when bound in the terminal position of the target than in other positions).

FIG. 9 depicts that where the probe has more repeat units than the target, loops can form in the probe.

FIG. 10A is an intensity plot of probes bound in internal positions on the target against target length, of full and partial duplex intensities, with different length probes.

FIG. 10B is an intensity plot of probes bound in both internal and terminal positions on the target against target minus probe length, of full and partial duplex intensities.

FIG. 10C is an intensity plot of probes bound in internal positions on the target against probe length, of full and partial duplex intensities, with different length targets.

FIG. 10D is an intensity plot of the ratio of probes bound in internal positions on the target over probes bound in terminal positions on the target against probe length, of full and partial duplex intensities.

FIG. 11A is a model intensity plot of probes bound in internal positions on the target against target length, of full and partial duplex intensities, with different length probes, in a melting temperature model.

FIG. 11B is a model intensity plot of probes bound to target against temperature.

FIG. 11C is a model intensity plot of probes bound in internal positions on the target against probe length, of full and partial duplex intensities.

FIG. 11D is a model intensity plot of the ratio of probes bound in internal positions on the target over probes bound in terminal positions on the target against target minus probe length, of full and partial duplex intensities.

FIG. 12A is a model intensity plot of probes bound in internal positions on the target against target length, of full and partial duplex intensities, with different length probes, in a thermodynamic weight model.

FIG. 12B is a model intensity plot of probes bound in internal positions on the target against target minus probe length, of full and partial duplex intensities, in a thermodynamic weight model.

FIG. 12C is a model intensity plot of probes bound in internal positions on the target against probe length, of full and partial duplex intensities, with different targets, in a thermodynamic weight model.

FIG. 12D is a model intensity plot of the ratio of probes bound in internal positions on the target over probes bound in terminal positions on the target against target minus probe length, of full and partial duplex intensities, in a thermodynamic weight model.

DETAILED DESCRIPTION

Disclosed is a method of “counting” tandem repeats in one or more designated target sequences using one or more oligonucleotide interrogation probes, each such probe preferably lacking an anchoring sequence but containing one or more tandem repeat sequences that are complementary to the target tandem repeat sequence(s) of interest. The (multiplexed) analysis of several targets in a single reaction is permitted by providing, for each target, one or more interrogation probes.

Random Encoded Array Detection (READ). Preferably, each of the one or more probes designed for the determination of the number of target repeats is attached, via its 5′ end, to an encoded microparticle (“bead”), wherein the code, implemented by way of a color scheme as shown in FIG. 1, identifies the sequence and length of the probe attached thereto.

I Low Temperature Regime

In general, in the low temperature regime, the operating temperature at which assay steps, including enzyme-mediated probe extension, are performed does not exceed the melting temperatures of the anticipated full duplex states. Conversely, it will be advantageous or necessary, for example in view of the requirement to perform the enzyme-mediated extension step at a certain temperature, to operate at a temperature which exceeds the melting temperatures of all anticipated partial and full duplex states. A high temperature regime is described herein in Section II.

I.1 Full Duplex Configurations: p≦t

If the number of repeat sequences in the probe is p, and the number of repeat sequences in the target is t, then, provided that p is less than t (p<t), probe and target can hybridize, with equal likelihood, in any of t−p+1 possible “full duplex” configurations involving the entire length of the probe, said configurations differing only in the phase of alignment. For example, the possible duplex configurations formed between a probe with three repeats and a target with six repeats are shown in FIG. 2: one unique “terminal alignment” configuration, wherein the probe is aligned with the “5′-terminal” nucleotide in the target repeat, and three “internal alignment” configurations, wherein the probe is aligned with “repeat-internal” nucleotides in the target repeat.

This description applies when duplex configurations involve the full length of the probe (p≦t) or the target (p>t) such that all full-length duplex repeat configurations, independent of the phase of the alignment, are realized with equal probability. Equal probabilities would be consistent with the evaluation of the free energy of duplex formation in terms of only the subsequences participating in the formation of the duplex. All full-length duplex configurations involving p repeats will then have the same free energy, that is, they will be “degenerate” configurations. More generally, however, also described is the more complex scenario involving partial duplex configurations as described in detail in Section II below.
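The counting of degenerate full-length configurations can be illustrated with a minimal Python sketch. It assumes p≦t and simply enumerates the t−p+1 alignment phases; the function name and the `offset` convention (number of target repeats separating the probe's 3′ end from the 5′-terminal target repeat) are illustrative assumptions, not part of the disclosed method.

```python
def full_duplex_configurations(p, t):
    """Enumerate full-length duplex configurations for a probe with p repeats
    and a target with t repeats, assuming p <= t.

    Each entry is (alignment, offset). offset == 0 is the single
    "5'-terminal" alignment; every other phase is "repeat-internal".
    The offset convention is a hypothetical choice made for this sketch.
    """
    if not 1 <= p <= t:
        raise ValueError("this sketch covers only 1 <= p <= t")
    return [("terminal" if offset == 0 else "internal", offset)
            for offset in range(t - p + 1)]


# Probe with three repeats on a target with six repeats (the FIG. 2 case).
configs = full_duplex_configurations(3, 6)
n_terminal = sum(1 for kind, _ in configs if kind == "terminal")
n_internal = sum(1 for kind, _ in configs if kind == "internal")
print(len(configs), n_terminal, n_internal)
```

For p=3 and t=6 the sketch lists four configurations, one terminal and three internal, matching the count t−p+1 given in the text.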

Differential Labeling of Duplex Configurations in Terminal and Internal Alignment

Differential labeling of the terminally aligned duplex states and the internally aligned duplex states, for example by way of single base extension using differentially labeled ddNTPs as disclosed herein, permits the “counting” of target repeats using one or more probes containing a known number of probe repeats.

Each probe is selected such that upon hybridizing with a target repeat in a “5′-terminal” alignment (that is, in a configuration placing the probe's 3′ end in juxtaposition to the “5′-terminal” nucleotide in the cognate target's tandem repeat, the “5′-terminal” nucleotide being that nucleotide in the target repeat located immediately adjacent to the 5′ flanking sequence), the probe is labeled with a first color (“Orange” in FIG. 2), and upon hybridizing with a target repeat in a “repeat-internal” alignment (that is, in a configuration placing the probe's 3′ end in juxtaposition to a nucleotide located in the “interior” of the cognate target's tandem repeat and hence not immediately adjacent to the 5′ flanking sequence), the probe is labeled with a second color (“Green” in FIG. 2) differing from said first color. Preferably, labeling is accomplished by template-mediated single nucleotide extension of the probe's 3′ end, for example by addition of a labeled dideoxynucleotide triphosphate (ddNTP) by methods well known in the art. Alternative methods, for example probe elongation by incorporation of individual labeled deoxynucleotide triphosphates (dNTPs), also can be used.

Repeat Counting by Concurrent Analysis of Multiple Probe-Target Configurations

The analysis of the results of the hybridization and labeling reactions described above permits the determination of the number of target repeats as follows. Generally, out of t−p+1 possible configurations of the duplex, there will be one full-length “5′-terminal” (“external”) alignment, labeled in the first color (“I-Orange”), and t−p full-length “internal” alignments, labeled in the second color (“I-Green”). Consequently, the intensities of green and orange signal recorded from the assay will be of the form

I_E = I-Orange ~ 1/(t−p+1)

and

I_I = I-Green ~ (t−p)/(t−p+1)

Accordingly, the ratio of I-Green to I-Orange recorded from the array of probes following extension is proportional to (t−p):

I_I/I_E ~ t−p

Examples of calculated profiles are shown in the Examples included herein.

Determination of Target Repeat Number, t. Applying this information relating to intensity ratios permits the determination of an unknown number of repeats in a target, x, preferably in an assay configuration providing internal calibration of the recorded intensity ratios. For example, a single probe can be used in combination with a reference target containing a known number of repeats identical in composition to those in the target of interest by comparing said proportion of orange and green labels obtained in separate interactions of the probe with the reference target and with the target of interest.

Multiple Probes. Successive determinations of target repeat numbers may be made by placing probes in solution so as to permit interaction with one target. Preferably, at least two probes, but no reference target, are used, each probe containing a known number of probe repeats, respectively p₁<t and p₂<t, p₁≠p₂, to construct a standard plot: given that the ratio of intensities, I-Green/I-Orange, is proportional to t−p, the plot of I-Green/I-Orange vs p would appear as in the right-hand panel of FIG. 3, permitting determination of the unknown number, x, of target repeats from the determination of the slope of the plot.
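As a hedged illustration of the multiple-probe analysis, the following Python sketch fits a straight line to measured ratios I_I/I_E against the probe repeat number p. It assumes the ratio has the form c·(t−p) with an unknown constant c, in which case the fitted line crosses zero at p=t; estimating t from that crossing point is a choice made for this sketch, and the function name and the synthetic data are hypothetical.

```python
def estimate_target_repeats(probe_repeats, ratios):
    """Estimate the target repeat number from (p, I_I/I_E) pairs.

    Assumes ratio = c * (t - p) for an unknown constant c, fits an ordinary
    least-squares line ratio = slope * p + intercept, and returns the value
    of p at which the fitted line reaches zero, i.e. -intercept / slope.
    """
    n = len(probe_repeats)
    if n < 2 or n != len(ratios):
        raise ValueError("need at least two (p, ratio) pairs")
    mean_p = sum(probe_repeats) / n
    mean_r = sum(ratios) / n
    sxx = sum((p - mean_p) ** 2 for p in probe_repeats)
    sxy = sum((p - mean_p) * (r - mean_r)
              for p, r in zip(probe_repeats, ratios))
    if sxx == 0 or sxy == 0:
        raise ValueError("probes must differ in p and ratios must vary with p")
    slope = sxy / sxx
    intercept = mean_r - slope * mean_p
    return -intercept / slope


# Synthetic, noise-free ratios for a hypothetical target with t = 9 repeats
# and an arbitrary proportionality constant c = 0.8.
probes = [3, 4, 6]
ratios = [0.8 * (9 - p) for p in probes]
estimate = estimate_target_repeats(probes, ratios)
print(round(estimate, 6))
```

With real data the estimate would be rounded to the nearest integer repeat number, and the scatter of the points about the fitted line gives a rough check on whether the full-duplex assumption holds.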

To simplify the analysis, partial duplex configurations, that is, those containing alignments other than full-length alignments, may be destabilized by a suitable choice of temperature or experimental conditions. Preferably, the temperature of the assay may be adjusted so that, for each probe containing p probe repeats, only full-length duplexes, i.e., those containing p duplex repeats, are stable, but duplexes containing fewer than p duplex repeats are not. Alternatively, the assay temperature can be set to ensure stability of a duplex containing at least p−k, 1≦k<p, duplex repeats; the operating temperature may be adjusted in the course of the assay to several values in accordance with a preset schedule to monitor the evolution of the multiple partial and full duplex configurations with temperature. Temperature scans will be particularly advantageous in order to identify the presence of multiple targets differing in the number of repeats, t.

“Offsets”. A modification in the design may be made to accommodate a configuration wherein the first nucleotide in the 5′ (“downstream”) flanking region of the target is the same as the nucleotide in the 3′ (“upstream”) terminal position of the target tandem repeat unit. Without modification in the assay design, probes hybridizing to the 5′ terminal repeat section of the target will be extended by addition of the same nucleotide, and will thus carry the same label, as probes hybridizing to the target in “repeat-internal” alignments. A simple modification in probe design alleviates this ambiguity: two or more nucleotides are added to each probe, each of these nucleotides chosen to be complementary to a nucleotide in the target's 5′ flanking region. This addition simply shifts each probe's 3′ terminus to a position such that extension for “repeat-exterior” alignment produces the first color whereas extension of “repeat-internal” alignment produces the second color.

In this same case, wherein the first nucleotide in the immediate 5′ (“downstream”) flanking region of the target is the same as the first nucleotide at the 3′ (“upstream”) end of the target tandem repeat unit, another design modification provides for the use of probes containing tandem repeats which are complementary to the nucleotides in the target repeat units, but are effectively offset, as illustrated in FIG. 4. That is, the probe repeat unit's first and last nucleotides are shifted with respect to the corresponding target repeat unit's first and last nucleotides. Appropriate selection of the first and last nucleotides in the probe ensures that the probe will be aligned such that its 3′ terminal end is juxtaposed to a position other than the target repeat unit's 5′ terminus. Upon probe extension, the nucleotide appended to the 3′ end of the probe aligned near the target's 5′ terminal flanking sequence is labeled differently from probes aligned in other positions.

I.2 Full Duplex States with Probe Overhangs (p>t)

Several extensions of this method are also disclosed. One of these applies to the situation which arises when the number of probe repeats, p, exceeds the number of target repeats, t, as shown in FIG. 5. In such a case, full duplex configurations involve overlap of t of the p probe repeats with the entire stretch of t target repeats, and only “5′-terminal” alignment will permit extension, given that all other configurations having t duplex repeats place the probe's 3′ terminus in juxtaposition with a portion of the target's flanking sequence. That is, only the first color signal, but not the second color signal, will be produced (see FIG. 5). This observation thus serves as an indication that p is equal to, or exceeds, t. Confirmation of this condition may be obtained by one of the following steps.

Reduction of Operating Temperature. First, the assay temperature may be lowered to a value permitting the formation of a stable duplex with fewer than t duplex repeats. The evaluation of the free energy of duplex formation in terms of participating subsequences, in the manner universally applied in the field of molecular biology for purposes of calculating “melting temperatures,” invokes a model of summing over free energy contributions associated with stacking interactions between nearest-neighbor base pairs in the duplex. Letting T denote temperature, and γ denote the average free energy per base pair, the “condensation” energy of a duplex of length N will be of the form F_Cond/T ~ γN; that is, the stability of the duplex increases as a function of its length.

The elimination of each duplex repeat will lower the melting temperature, T_M, of the duplex by a constant decrement so that the temperature, T, can be adjusted so as to permit formation of duplexes having d=t−1 repeats, but not of duplexes having d=t−2 repeats: T_M(d=t−2) ≦ T < T_M(d=t−1). The appearance of the second color in response to the lowering of the temperature to the appropriate range therefore will confirm that p≧t. Further lowering of the temperature, permitting formation of partial duplexes with even fewer repeats, will lead to a further increase in the strength of the second signal in a manner which reflects the increasing number of possible configurations permitting “repeat-internal” alignment.
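The temperature window can be made concrete with a short Python sketch. It assumes, as stated above, that each eliminated duplex repeat lowers the melting temperature by a constant decrement, and it treats a duplex as stable whenever the operating temperature lies below its melting temperature. The function names and all numerical values are hypothetical and chosen only for illustration.

```python
def melting_temperature(d, d_full, tm_full, decrement):
    """Melting temperature of a duplex with d repeats, assuming a constant
    decrement per eliminated repeat relative to the full duplex (d_full)."""
    return tm_full - (d_full - d) * decrement


def stable_duplex_repeats(temperature, d_full, tm_full, decrement):
    """Return the duplex repeat numbers d (1..d_full) that are stable at the
    given operating temperature, i.e. those with T < T_M(d)."""
    return [d for d in range(1, d_full + 1)
            if temperature < melting_temperature(d, d_full, tm_full, decrement)]


# Hypothetical numbers: a full duplex of 5 repeats melting at 70 degrees,
# losing 6 degrees per eliminated repeat, so T_M(4) = 64 and T_M(3) = 58.
# Operating at 60 degrees lies in the window T_M(3) <= T < T_M(4).
stable = stable_duplex_repeats(60.0, d_full=5, tm_full=70.0, decrement=6.0)
print(stable)
```

In this example only duplexes with four or five repeats are stable, which corresponds to the window T_M(d=t−2) ≦ T < T_M(d=t−1) for t=5.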

Labeling Configurations with 5′ Overhangs: Adapter Sequences. Second, an oligonucleotide containing an adapter sequence (“adapter”) may be included in the assay, as shown in the third and fourth schematics in FIG. 6. The adapter has at least two portions, one of which is complementary to a flanking region adjacent to the 5′ end of the target repeat and one of which is complementary to a probe repeat unit. The hinge region of the adapter, serving only the function of holding the adapter in place once hybridization has taken place, may be composed of “neutral” nucleotides capable of forming a bond with any of the four specific nucleotides. The latter portion of the adapter includes at least one additional nucleotide, designated “N,” that is juxtaposed to a nucleotide added to the 3′ end of the probe. For example, when the probe is in the “one repeat overhang” position shown in FIG. 6, it can be extended at its 3′ end with a unique ddNTP that is complementary to nucleotide N, and labeled with a unique third color (“Red” in FIG. 6) which differs from the first and second colors, such as green and orange, used to identify probe extension in the situations described above. The appearance of the third color (“Red”) in response to the addition of an adapter and an appropriately labeled nucleotide to the reaction mixture of interest, and the absence of signal indicating repeat-internal alignment, therefore confirms a probe to have a larger number of repeats than the target.

In one method of determining the number of target repeats, one can use a series of probes having increasing numbers of repeats in conjunction with an adapter, some of the probes having a number of repeats greater than, and others having a number smaller than, the number of target repeats. In such case, to analyze one or more samples simultaneously, one would attach all the probes to encoded beads, the code identifying the length of the attached probe. The adapter and its labeling system will identify the presence of probes which have a greater number of repeats than does the cognate target in the sample.

Labeling Configurations with 5′ Overhangs: Digestion of Tails. Third, the target may be enzymatically modified to permit target extension using the probe as a template. For example, a first nucleotide, designated “Z” in FIG. 7, may be inserted immediately adjacent to the 5′ terminus of the first probe repeat unit, this first nucleotide being selected so as to differ from the nucleotide at the 3′ end of the probe repeat unit. The strategy, as shown in FIG. 7, is to permit an exonuclease (e.g., Exo I) to digest accessible “overhanging” single-stranded portions of the target, progressing in the 3′ to 5′ direction and leaving the 3′ end of the target's tandem repeat unit available for probe-mediated extension with labeled ddNTPs. Specifically, the target is extended in a second step with a nucleotide labeled with a first color (“Green”) when target and probe repeat units are aligned as shown in FIG. 7 or in an equivalent configuration, and with a nucleotide matching the “Z” nucleotide in the probe sequence and labeled with a second, different color (“Orange”) otherwise. After determining the relative intensities of green to orange labels, using the plot shown in FIG. 3, one uses essentially the same method described in the associated text to determine the number of repeats in the unknown target, noting, however, that in this case I-Green/I-Orange is proportional to p−t.

II Partial Duplex Configurations

The aforementioned analysis of full duplex configurations may be extended to the analysis of the full set of configurations involving “tails” (FIG. 8), including partial duplex configurations in which the number of duplex repeats, d, is smaller than both the number of probe repeats, p, and the number of target repeats, t. Slippage, that is, shifts in the phase of the alignment of probe-target repeats, must be anticipated especially in the absence of chosen “anchor” sequences flanking the probe repeats on the 3′ or 5′ side. In addition to tails, “loops” also may form (FIG. 9).

The presence of partial duplex configurations manifests itself in experimental data, for example, in the form of finite green signal intensity recorded under the condition t−p=0. The presence of loops will manifest itself in the form of higher than expected green and orange signals recorded under the condition t<p.
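The set of partial and full duplex configurations discussed in this section can be enumerated as a serial product of two strings of 1's, as the paragraphs that follow describe. A minimal Python sketch is given here, computing the serial product as a plain discrete convolution rather than by matrix multiplication. The function names are hypothetical, and padding the shorter target string with trailing zeros before adding two target strings is an assumption made for this sketch.

```python
def serial_product(probe, target):
    """Discrete convolution of two sequences; for strings of 1's each entry
    is the number of duplex repeats in the corresponding alignment."""
    out = [0] * (len(probe) + len(target) - 1)
    for i, a in enumerate(probe):
        for j, b in enumerate(target):
            out[i + j] += a * b
    return out


def add_padded(x, y):
    """Element-wise sum of two sequences, padding the shorter one with
    trailing zeros (an alignment convention assumed for this sketch)."""
    n = max(len(x), len(y))
    x = list(x) + [0] * (n - len(x))
    y = list(y) + [0] * (n - len(y))
    return [a + b for a, b in zip(x, y)]


# Single target: p = 3, t = 5.
single = serial_product([1] * 3, [1] * 5)
print(single)  # [1, 2, 3, 3, 3, 2, 1]

# Heterozygous sample with t1 = 4 and t2 = 6, same probe (p = 3):
# summing the two products equals the product with the summed targets.
p_string, t1_string, t2_string = [1] * 3, [1] * 4, [1] * 6
summed_products = add_padded(serial_product(p_string, t1_string),
                             serial_product(p_string, t2_string))
product_of_sum = serial_product(p_string, add_padded(t1_string, t2_string))
print(summed_products)
```

The single-target result has length t+p−1=7, with three entries equal to p (the full duplex configurations) and 2(p−1)=4 smaller entries (the partial duplex configurations). The heterozygous result has length max(t₁, t₂)+p−1=8.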

Partial Duplex Configurations with Tails: Serial Product. Given a probe containing p repeats and a target containing t repeats, the set of all possible partial and full duplex configurations is readily enumerated by representing probe and target repeats by strings of 1's and computing the serial product of the two strings. For example, with p=3, t=5, the serial product of <1 1 1> and <1 1 1 1 1> produces the string <1 2 3 3 3 2 1> of length 3+5−1. Generally, the serial product of a string P of length p and a string T of length t will be a string, P*T, of length t+p−1, each field of the string giving the number of duplex repeats in the corresponding configuration and thus representing the density of states. The serial product is conveniently evaluated by resorting to matrix multiplication (Seul, O'Gorman and Sammon, “Practical Algorithms for Image Analysis”, Cambridge University Press, 2000), as illustrated in Example III. Of the t+p−1=(t−p+1)+2(p−1) possible configurations, t−p+1 are degenerate full duplex configurations and 2(p−1) are partial duplex configurations.

Heterozygosity: Two Targets of Differing Repeat Numbers. In the context of applying tandem repeat analysis to identity typing, samples typically will be heterozygous for repeat number polymorphism. That is, the analysis must reveal the presence of, and identify, two target strands, T₁ and T₂, of differing repeat numbers, t₁=t(T₁)≠t₂=t(T₂). The set of corresponding duplex configurations is readily evaluated using the distributive property of the serial product: P*T₁+P*T₂=P*(T₁+T₂), yielding a string of length max(t₁, t₂)+p−1. The resulting density of states will differ from that of either target alone, as illustrated in Example III.

III Non-degenerate Configurations: Weights

In certain instances, as in the case of a certain range of preferred extension temperatures, the assay may require an operating temperature that exceeds the nominal “melting” temperatures of some duplexes, T_(M)(d=1)<T_(M)(d=2)< . . . <T< . . . <T_(M)(d=min(p, t)), or all duplexes, T_(M)(d=1)<T_(M)(d=2)< . . . <T_(M)(d=min(p, t))<T. This condition generally will favor the formation of partially or completely denatured probe-target duplex states in which all configurations displaying d duplex repeats will be formed with a certain probability, w(d).

Partially or completely denatured probe-target duplex states generally will not all form, or may not all be subject to labeling, with equal probability. Disclosed, therefore, is a method to introduce weight functions, w=w(d), reflecting the probability of formation of labeled duplex configurations.

The weight functions are used herein to model the probability of formation of an observable configuration which requires a first step of annealing of target (T) to a probe (P) and a second step of labeling of the probe-target complex (PT), for example by way of single-base extension, requiring the reaction of the complex with an enzyme (E) and the reaction of the resulting complex with labeled substrate (S*) to produce the labeled probe, (S*−P).

P+T+E+S*⇄PT+E+S*⇄PTE+S*⇄S*−PT+E⇄S*−P+T+E

wherein each reaction equilibrium is governed by a pair of kinetic rate constants, k_(ON) and k_(OFF). The reaction involves the recycling of target (“template”) as well as enzyme, permitting isothermal “accumulation” of labeled product, S*−P. In this regard, the multiplexed analysis of duplex repeat configurations illustrates the more general case of elongation or extension-mediated multiplexed analysis of polymorphisms. Labeling with ddNTPs effectively removes one constituent ingredient, namely probe, from the reaction. In a preferred embodiment, labeled product is immobilized by attachment to a solid phase carrier such as an encoded microparticle.

Setting d*=d_(max), wherein d_(max)=p for t≧p and d_(max)=t for t<p, that is, d*=min(p, t), expressions for the total intensity, I_(E), of 5′-terminally aligned states, and the total intensity, I_(I), of internally aligned states, are readily evaluated:

$I_E = \frac{w(d^*)}{t+p-1}$

and

$I_I = \frac{w(d^*)}{t+p-1}\left\{\sum_{d=1}^{d^*-1}\frac{w(d)}{w(d^*)} + (t-p)\right\}, \quad p<t$

or

$I_I = \frac{w(d^*)}{t+p-1}\left\{\sum_{d=1}^{d^*-1}\frac{w(d)}{w(d^*)}\right\}, \quad p\ge t$

Defining the function H(x) to denote the step function, H(x)=0 for x≦0 and H(x)=1 for x>0, permits the use of the compact notation

$I_I = \frac{w(d^*)}{t+p-1}\left\{\sum_{d=1}^{d^*-1}\frac{w(d)}{w(d^*)} + (t-p)H(t-p)\right\}$

Defining the function

-   g_(d)=2; d<d*
-   g_(d)=t−p; d=d*

to denote the multiplicity of each duplex configuration, the expressions for I_(E) and I_(I) are seen to represent partial sums contributing to the partition function

$Z = \frac{1}{t+p-1}\sum_{\text{all duplex configurations}} g_d\, w(d)$

Also of interest is the ratio:

$I_I/I_E = \sum_{d=1}^{d^*-1}\frac{w(d)}{w(d^*)} + (t-p)H(t-p)$

Example IV illustrates the evaluation of these expressions for specific weight functions, w=w(d). The computed profiles will permit the analysis of experiments. For example, a set of “standard curves” may be computed to show the expected variation of intensities, I_(E) and I_(I), as a function of the number of target repeats, t, for partial duplex configurations formed between a target and two or more probes of p₁, p₂, . . . , repeats, the p_(j) preferably bracketing t. Alternatively, trial profiles may be employed in regression analysis of experimental data to determine t, the number of repeats in the target.
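As a rough illustration of how such standard curves might be tabulated, the following Python sketch evaluates I_(E), I_(I) and their ratio directly from the general expressions above for an arbitrary weight function. The function name `intensities` and the choice to pass w as a callable are illustrative assumptions and not part of the disclosed method.

```python
def intensities(w, p, t):
    """Return (I_E, I_I, I_I/I_E) for a probe of p repeats and a target of t repeats.

    w is a callable giving the weight w(d) of a configuration with d duplex repeats.
    """
    d_star = min(p, t)                  # d* = min(p, t)
    n_config = t + p - 1                # total number of configurations
    step = 1 if t > p else 0            # step function H(t - p)
    partial = sum(w(d) for d in range(1, d_star))   # sum over d = 1 .. d* - 1
    i_e = w(d_star) / n_config
    i_i = (partial + (t - p) * step * w(d_star)) / n_config
    return i_e, i_i, i_i / i_e


if __name__ == "__main__":
    # Hypothetical standard curve: probe of p = 6 repeats, constant weights w(d) = 1.
    for t in range(4, 11):
        i_e, i_i, ratio = intensities(lambda d: 1.0, 6, t)
        print(t, round(i_e, 4), round(i_i, 4), round(ratio, 4))
```

Passing the weight function as an argument keeps the same routine usable for each of the trial functions discussed in Example IV.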

III.1 The Role of Configurational Entropy in Duplex Formation

The “condensation” energy of a duplex of length N will be of the form F_(Cond)/T˜γN, T denoting temperature, and γ denoting the average free energy per base pair. However, this contribution must be balanced against the loss in configurational entropy. The condensation of two flexible single nucleic acid strands into a “bound” state in the form of a “stiff” double-stranded (ds) duplex configuration containing d repeats reduces the number of configurations available to each strand in comparison to those available to each strand in its respective “free” state. For example, only a single state, namely the full duplex state, remains in the special case d=p=t, while for an unconstrained single chain of length s≧t, the number of configurations, enumerated as the number of random walks on a lattice of coordination number z, varies as ˜z^(s). Under this scenario, increasingly longer probe-target duplex configurations will be increasingly less likely, especially at high temperature.

Thus, duplex formation by way of the condensation of two single strands of length N from “random coil” configurations into a duplex of d<<N repeats implies a loss of configurational entropy of the two original strands. As with the confinement of a polymer chain (de Gennes, “Scaling Concepts in Polymer Physics”, Cornell University Press, 1979), this condensation requires a deformation of the target strand so as to form the locally stiff duplex state with a probe strand which must undergo a similar deformation.

The free energy of deformation has the form (F_(El)/T)˜K(T)(L/L₀)^(δ), where L₀˜aN^(v) is the characteristic size of a free target strand containing N monomers of characteristic size a, and L is the characteristic size of the target in the duplex state; v(3d)=3/5 and δ=1/(1−v) are exponents characterizing the statistical behavior of “real” polymer chains in good solvents. If that state contains N_(D) monomers in d repeats (as does the probe), then L_(D)=aN_(D) is the length of that “stiff” full duplex; the target “overhangs” thus contain a total of N−N_(D) monomers and, assuming for simplicity an equal distribution of monomers between the two tails, each have a characteristic size L_(T)˜a(½(N−N_(D)))^(v). Thus,

$\begin{aligned} F_{El}/T &\sim \left(L/L_0\right)^{\delta} \\ &\sim \left[2a\left(\tfrac{1}{2}(N-N_D)\right)^{v} + aN_D\right]^{\delta} / \left[aN^{v}\right]^{\delta} \\ &\sim \left(2/2^{v}\right)\left[\left(1-N_D/N\right)^{v} + \left(N_D/N\right)N^{1-v}\right]^{\delta} \end{aligned}$

In the limit N_(D)/N˜1:

$F_{El}/T \sim \left(2/2^{v}\right)\left[\left(N_D/N\right)N^{1-v}\right]^{\delta} \sim N_D$

In the limit N_(D)/N<<1:

$F_{El}/T \sim \left(2/2^{v}\right)\left[1 + \left(N_D/N\right)\left(N^{1-v} - v\right)\right]^{\delta}$

In the limit N_(D)/N˜1, with d˜N_(D), the total free energy of duplex formation has the form F/T˜(K−γ)N_(D), wherein both the elastic constant, K, and the condensation energy per pair, γ, will depend (in generally differing form) on temperature, and w(d)˜exp(−(K−γ)d/ΔT), suggesting a model of geometric weights such as that discussed in Example IV. The exponent of the constant c=c(T)=exp(−(K−γ)/ΔT) changes sign at a specific transition temperature, so that c passes through 1; when the entropic term dominates, formation of a duplex of increasing d will become increasingly less likely.

EXAMPLES

Example I

Calculated melting temperatures of duplex repeats in FES, a marker commonly used for parentage analysis, display a characteristic increment with the number of duplex repeats, permitting the choice of assay temperature so as to select a minimal number of repeats in thermodynamically stable duplex repeats:

Duplex Repeats                           4     6     8     9     10    11    12    13
Calculated Melting Temperature (° C.)    27.1  39.1  44.7  46.5  48.0  49.3  50.3  51.3

At T=T_(m), a fraction of one half of the corresponding probe-target duplex configurations is “unbound”, and this fraction increases as a function of increasing ΔT=T−T_(m). At a typical operating temperature of 60° C. for polymerase-mediated extension, all FES markers shown in the table will thus be predominantly in their unbound configuration.
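The choice of assay temperature described in this example can be sketched in a few lines of Python. The sketch assumes, as a simplification, that a duplex counts as “stable” when its calculated melting temperature is at or above the assay temperature; the names `FES_TM` and `minimal_stable_repeats` are hypothetical.

```python
# Calculated melting temperatures (deg C) from the table in Example I,
# keyed by the number of duplex repeats.
FES_TM = {4: 27.1, 6: 39.1, 8: 44.7, 9: 46.5, 10: 48.0, 11: 49.3, 12: 50.3, 13: 51.3}


def minimal_stable_repeats(assay_temp_c, tm_table=FES_TM):
    """Smallest tabulated d whose melting temperature is at or above assay_temp_c.

    Returns None when no tabulated duplex meets the (assumed) stability criterion.
    """
    stable = [d for d, tm in tm_table.items() if tm >= assay_temp_c]
    return min(stable) if stable else None


if __name__ == "__main__":
    for temp in (30.0, 45.0, 60.0):
        print(temp, minimal_stable_repeats(temp))
```

At 60° C. the function returns None, in line with the statement that all tabulated FES duplexes are predominantly unbound at that temperature.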

Example II

Evaluation of the serial product of two strings yields a string containing the number of duplex repeats in each possible configuration; it is conveniently evaluated by matrix multiplication.

II.1 Serial Product of p=<1 1 1 1 1> and t=<1 1 1>:

$\begin{pmatrix}1&0&0&0&0&0&0\\1&1&0&0&0&0&0\\1&1&1&0&0&0&0\\1&1&1&1&0&0&0\\1&1&1&1&1&0&0\\0&1&1&1&1&1&0\\0&0&1&1&1&1&1\end{pmatrix}\begin{pmatrix}1\\1\\1\\0\\0\\0\\0\end{pmatrix}=\begin{pmatrix}1\\2\\3\\3\\3\\2\\1\end{pmatrix}$

II.2 Serial Product of p=<1 1 1> and t=<1 1 1 1 1>:

$\begin{pmatrix}1&0&0&0&0&0&0\\1&1&0&0&0&0&0\\1&1&1&0&0&0&0\\0&1&1&1&0&0&0\\0&0&1&1&1&0&0\\0&0&0&1&1&1&0\\0&0&0&0&1&1&1\end{pmatrix}\begin{pmatrix}1\\1\\1\\1\\1\\0\\0\end{pmatrix}=\begin{pmatrix}1\\2\\3\\3\\3\\2\\1\end{pmatrix}$

Target and matching probe sequences are mutually reverse complementary. The formation of duplex repeats thus corresponds to the parallel evaluation of the convolution of probe and target repeat sequences and represents an instance of parallel computation (“DNA computing”).
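The serial product of this example can be reproduced with a short Python sketch. Two routes are shown: a direct discrete convolution, and a product of a banded matrix built from the probe string with the zero-padded target string, mirroring the layout of II.1 and II.2. The function names are illustrative assumptions.

```python
def serial_product(p_string, t_string):
    """Full discrete convolution of two sequences; result has length p + t - 1."""
    out = [0] * (len(p_string) + len(t_string) - 1)
    for i, a in enumerate(p_string):
        for j, b in enumerate(t_string):
            out[i + j] += a * b
    return out


def serial_product_matrix(p_string, t_string):
    """Same result as serial_product, written as (banded matrix) x (padded vector)."""
    p, t = len(p_string), len(t_string)
    n = p + t - 1
    padded = list(t_string) + [0] * (n - t)       # target string padded with zeros
    # Entry (k, j) holds P[k - j], so row k dotted with padded gives sum_j P[k-j] * T[j].
    matrix = [[p_string[k - j] if 0 <= k - j < p else 0 for j in range(n)]
              for k in range(n)]
    return [sum(m * x for m, x in zip(row, padded)) for row in matrix]


if __name__ == "__main__":
    print(serial_product([1] * 3, [1] * 5))
    print(serial_product_matrix([1] * 5, [1] * 3))
```

Both calls reproduce the string <1 2 3 3 3 2 1>, showing that the result does not depend on which of the two strings is taken as the probe.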

Example III

In the tables below, each row lists the number of duplex repeats in each of the possible configurations. Full length duplex and partial duplex configurations with internal alignment are listed in the column “Internal”; the configuration with terminal alignment is listed in the column “Terminal”; the configurations listed to the right of the “Terminal” column will remain unlabeled (unless special steps are taken as described herein).

Duplex configurations formed by probes containing p=4, 6, 8, 10, 12 repeats with target containing t=8 repeats, obtained by evaluation of serial products of unit strings P, T.

p    t+p−1   Internal             Terminal   Unlabeled
4    11      1 2 3 4 4 4 4        4          3 2 1
6    13      1 2 3 4 5 6 6        6          5 4 3 2 1
8    15      1 2 3 4 5 6 7        8          7 6 5 4 3 2 1
10   17      1 2 3 4 5 6 7        8          8 8 7 6 5 4 3 2 1
12   19      1 2 3 4 5 6 7        8          8 8 8 8 7 6 5 4 3 2 1

Duplex configurations formed by probes containing p=4, 6, 8, 10, 12 repeats with target containing t=10 repeats, obtained by evaluation of serial products of unit strings P, T.

p    t+p−1   Internal             Terminal   Unlabeled
4    13      1 2 3 4 4 4 4 4 4    4          3 2 1
6    15      1 2 3 4 5 6 6 6 6    6          5 4 3 2 1
8    17      1 2 3 4 5 6 7 8 8    8          7 6 5 4 3 2 1
10   19      1 2 3 4 5 6 7 8 9    10         9 8 7 6 5 4 3 2 1
12   21      1 2 3 4 5 6 7 8 9    10         10 10 9 8 7 6 5 4 3 2 1

Duplex configurations formed by probes containing p=4, 6, 8, 10, 12 repeats with a mixture of two targets, one containing t=8 repeats, the other 10 repeats, obtained by evaluation of serial products of unit strings P and a string representing the sum of unit strings T(t=8) and T(t=10).

p    max(t₁,t₂)+p−1   Internal                   Terminal   Unlabeled
4    13               1 2 4 6 7 8 8 8 8          8          6 4 2
6    15               1 2 4 6 8 10 11 12 12      12         10 8 6 4 2
8    17               1 2 4 6 8 10 12 14 15      16         14 12 10 8 6 4 2
10   19               1 2 4 6 8 10 12 14 16      18         17 16 14 12 10 8 6 4 2
12   21               1 2 4 6 8 10 12 14 16      18         18 18 17 16 14 12 10 8 6 4 2
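The mixture profiles of this example can be checked with the following Python sketch of the distributive property P*T₁+P*T₂=P*(T₁+T₂). It assumes that the shorter target string is padded with leading zeros before the two unit strings are summed; this alignment is an assumption, chosen because it reproduces the values listed for the t=8 and t=10 mixture. All function names are hypothetical.

```python
def serial_product(p_string, t_string):
    """Full discrete convolution of two sequences; result has length p + t - 1."""
    out = [0] * (len(p_string) + len(t_string) - 1)
    for i, a in enumerate(p_string):
        for j, b in enumerate(t_string):
            out[i + j] += a * b
    return out


def padded_unit_string(t, length):
    """Unit string of t ones, padded with leading zeros to the given length."""
    return [0] * (length - t) + [1] * t


def mixture_profile(p, t1, t2):
    """Serial product of a unit probe string with the sum of two target strings."""
    longest = max(t1, t2)
    summed = [a + b for a, b in zip(padded_unit_string(t1, longest),
                                    padded_unit_string(t2, longest))]
    return serial_product([1] * p, summed)


if __name__ == "__main__":
    for p in (4, 6, 8, 10, 12):
        print(p, mixture_profile(p, 8, 10))
```

Evaluating the two targets separately and adding the resulting strings element by element gives the same profile, which is the distributive property stated in the text.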

Example IV

Weight Functions

In the cases considered below, c denotes a constant which generally will depend on experimental parameters such as temperature, ionic strength, pH as well as properties of fluorescent dyes or other labels used in creating the assay signal. More generally, to reflect differences in the chemical properties of dye labels or spectral variations in the optical response of the experimental apparatus employed to read intensities in the two color channels, it will be desirable to allow for the possibility of using two constants, c_(E)≠c_(I).

IV.1 Simple Trial Functions

IV.1.1 w(d)=c

In the presence of enzyme in large excess over target, any probe-target complex with matching configuration has an essentially equal chance of being recognized by enzyme, and, in the presence of labeled dNTP or ddNTP, extension will produce labeled probe regardless of the number of duplex repeats in the probe-target complex.

The expressions take the form:

$I_E = \frac{c}{t+p-1}$

and

$I_I = \frac{c}{t+p-1}\left\{(d^*-1) + (t-p)H(t-p)\right\}$

Also of interest is the ratio:

$I_I/I_E = (d^*-1) + (t-p)H(t-p)$

IV.1.2 Linear Weights: w(d)=cd

If the formation of extended probe is governed by the formation of a duplex between probe and target, thermodynamic stability of the complex will increase with increasing d: the larger the number of duplex repeats of a given configuration, the greater the probability, w(d), of formation of that configuration, and the more likely its being labeled by an enzyme-mediated extension reaction.

Using a simple trial function to represent the proportionality of w(d) to d, the expressions take the form:

$I_E = \frac{c\,d^*}{t+p-1}$

and

$I_I = \frac{c\,d^*}{t+p-1}\left\{\tfrac{1}{2}(d^*-1) + (t-p)H(t-p)\right\}$

and, for the ratio:

$I_I/I_E = \tfrac{1}{2}(d^*-1) + (t-p)H(t-p)$

which, for t≧p (that is, d*=p), equals ½(p−1)+(t−p). For the special case t=p=d*, these expressions simplify to:

$I_E = \frac{c\,t}{2t-1}$

and

$I_I = \frac{c\,t}{2t-1}\cdot\tfrac{1}{2}(t-1)$

and, for the ratio:

$I_I/I_E = \tfrac{1}{2}(t-1)$

The ratio profile permits the determination of t by determination of the intercept.

For the case c_(E)=c_(I)=c, intensity profiles, I_(E), I_(I) and I_(I)/I_(E), as a function of t, p and t−p are shown in FIGS. 10A to 10D. In order to determine the number of repeats in a target sequence of interest, the target is permitted to form a duplex with two or more probes, preferably displayed on encoded beads, and the pattern of intensities for external and for internal termination is analyzed. For example, the intercept of the ratio I_(I)/I_(E)=½(t−1) (FIG. 10D) permits the direct determination of t.

The profile for external termination displays a small decrease with t (FIG. 10B), reflecting the increase in the total number, t+p−1, of possible configurations, while the profile for internal termination displays an increase even for t−p=0, reflecting the formation of partial duplex configurations.
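For the linear weights of IV.1.2, the ratio I_(I)/I_(E)=½(d*−1)+(t−p)H(t−p) can be inverted for t in closed form. The Python sketch below does so under idealized assumptions: c_(E)=c_(I), noise-free intensities, and weights exactly proportional to d. The function names are hypothetical.

```python
def ratio_linear(p, t):
    """Model ratio I_I / I_E for linear weights w(d) = c d."""
    d_star = min(p, t)
    return 0.5 * (d_star - 1) + max(t - p, 0)


def estimate_t_linear(ratio, p):
    """Invert ratio_linear for t, given a measured ratio and the probe length p."""
    threshold = 0.5 * (p - 1)           # value of the ratio at t = p
    if ratio >= threshold:              # branch t >= p: ratio = (p - 1)/2 + (t - p)
        return p + (ratio - threshold)
    return 2.0 * ratio + 1.0            # branch t < p: ratio = (t - 1)/2


if __name__ == "__main__":
    for p in (4, 6, 8, 10, 12):
        r = ratio_linear(p, 8)
        print(p, r, estimate_t_linear(r, p))
```

With real data, the estimates obtained from several probes would not agree exactly, which is one reason the text recommends analyzing the pattern across two or more probes.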

IV.1.3 High Temperature Regime: w(d)=1/(T−T_(m)(d))

In the high temperature regime, T>T_(m)(d*), under appropriate reaction conditions, the formation of labeled product by enzyme-catalyzed probe extension may be governed by the probability of forming a probe-target duplex containing d duplex repeats. This probability will decrease with ΔT(d)=T−T_(m)(d), wherein T_(m)(d) denotes the “melting” temperature of a complex containing d duplex repeats: the shorter d, the less likely the formation of the corresponding probe-target complex.

A simple trial function representing the high temperature portion (T≧T_(m)) of a “melting curve” is obtained by assuming a constant increment δT in melting temperature per addition of a single repeat and setting

$T_m(d) = T_m(d^*) - (d^*-d)\,\delta T$

so that

$w(d) = \frac{1}{T - T_m(d^*) + (d^*-d)\,\delta T}$

or, with C(d*)=T−T_(m)(d*):

$w(d) = \frac{1}{C + (d^*-d)\,\delta T}$

Note that, since d*=min(p, t), each probe-target pair is characterized by a different value of C=C(d*).

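Making the (rather drastic) approximation C(d*)=C, independent of d*, the expressions assume the form:

$I_E = \frac{1}{C\,(t+p-1)}$

and

$I_I = \frac{1}{C\,(t+p-1)}\left[\left\{\frac{C}{C+(d^*-1)\,\delta T} + \frac{C}{C+(d^*-2)\,\delta T} + \dots + \frac{C}{C+\delta T}\right\} + (t-p)H(t-p)\right]$

And, for the ratio:

$I_I/I_E = \left\{\frac{C}{C+(d^*-1)\,\delta T} + \frac{C}{C+(d^*-2)\,\delta T} + \dots + \frac{C}{C+\delta T}\right\} + (t-p)H(t-p)$

Examples of calculated profiles are shown in FIGS. 11A to 11D.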

This trial function may be generalized in various ways, for example by introducing an exponent, β≠1, to represent the high temperature portion of the melting curve, by permitting δT to depend on d, or generally by supplying explicit calculated or experimentally determined melting temperatures, and by evaluating the parameter C=C(d*) for each probe-target pair.

Preferably, to analyze experimental data obtained with a set of probes of known p, regression analysis would be used to obtain d* and hence t.
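One simple way such a regression might be set up is a least-squares search over integer trial values of t, sketched below in Python for the high temperature trial function w(d)=1/(C+(d*−d)δT). The sketch assumes that C and δT are known and common to all probes (the approximation C(d*)=C), and that the measured quantity for each probe is the ratio I_(I)/I_(E). The function names and the grid search are illustrative choices, not the disclosed procedure.

```python
def model_ratio(p, t, c_const, delta_t):
    """Model I_I / I_E for w(d) = 1 / (C + (d* - d) * dT) with a common C."""
    d_star = min(p, t)
    # Sum over d = 1 .. d* - 1 of w(d) / w(d*) = C / (C + (d* - d) * dT)
    partial = sum(c_const / (c_const + (d_star - d) * delta_t)
                  for d in range(1, d_star))
    return partial + max(t - p, 0)


def fit_t(observed, c_const, delta_t, t_candidates):
    """Return the trial t minimizing the summed squared error over all probes.

    observed maps probe repeat number p to the measured ratio I_I / I_E.
    """
    def sse(t):
        return sum((model_ratio(p, t, c_const, delta_t) - r) ** 2
                   for p, r in observed.items())
    return min(t_candidates, key=sse)


if __name__ == "__main__":
    # Synthetic, noise-free "measurements" generated from the model itself (t = 9).
    probes = (4, 6, 8, 10, 12)
    data = {p: model_ratio(p, 9, 5.0, 2.0) for p in probes}
    print(fit_t(data, 5.0, 2.0, range(2, 21)))
```

In practice C, δT, or an exponent β could be added as further fit parameters, as the preceding paragraph suggests.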

IV.2 Thermodynamic Weights: w(d)=c w(d−1)

In analogy to the classic “zipper” models developed to describe the helix-coil transition of a polypeptide chain and the condensation of a pair of nucleic acid strands into a duplex (Cantor & Schimmel, “Biophysical Chemistry”, Vol 3, 1981), the partition function of the probe-target duplex repeat may be represented in the form

$Z = \frac{1}{t+p-1}\sum_{0<d\le d^*} g_d \exp\left(-d\,\Delta F_0/kT\right)$

where ΔF₀/kT represents the free energy of duplex formation per base pair which may be augmented by a nucleation term, σ=exp(−ΔF_(N)/kT). Setting w₁=w(d=1)=exp(−ΔF₀/kT)=c, the expression assumes the form

$Z = \frac{1}{t+p-1}\sum_{0<d\le d^*} g_d\,\sigma\,c^{d}$

Neglecting the nucleation term, this leads to the geometric weight function w(d)=c^(d) and the following expressions for the intensities:

$I_E = \frac{c^{d^*}}{t+p-1}$

and

$I_I = \frac{c^{d^*}}{t+p-1}\left\{\frac{1-c^{d^*-1}}{c^{d^*-1}(1-c)} + (t-p)H(t-p)\right\}$

and, for the ratio:

$I_I/I_E = \frac{1-c^{d^*-1}}{c^{d^*-1}(1-c)} + (t-p)H(t-p)$

The nucleation term, σ, will reflect the increasing entropic penalty of forming a duplex of increasing length, as discussed herein. For example, the probability of forming the first duplex repeat between a probe and a target having t repeats within a sequence of total length L will be inversely proportional to the volume of the coil formed by the target in solution, ˜1/L³; that is, with L˜aN^(v), N denoting the total number of nucleotides in the sequence, the probability of initial pair formation will scale as ˜(t/N)^(3v); σ may be treated as an additional parameter for purposes of regression analysis of experimental data.
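The geometric weights w(d)=c^(d) can be evaluated by direct summation, as in the following Python sketch, which neglects the nucleation term (σ=1). The helper `geometric_partial_sum` uses the standard geometric-series identity for the sum of c^(d) over d=1 to d*−1 as a cross-check on the direct sum. Function names are hypothetical, and c≠1 is assumed for the closed form.

```python
def geometric_intensities(p, t, c):
    """Return (I_E, I_I, I_I/I_E) for geometric weights w(d) = c**d (sigma = 1)."""
    d_star = min(p, t)
    n_config = t + p - 1
    partial = sum(c ** d for d in range(1, d_star))      # d = 1 .. d* - 1
    w_star = c ** d_star
    i_e = w_star / n_config
    i_i = (partial + max(t - p, 0) * w_star) / n_config
    return i_e, i_i, i_i / i_e


def geometric_partial_sum(d_star, c):
    """Closed form of sum_{d=1}^{d*-1} c**d for c != 1 (geometric series)."""
    return c * (1.0 - c ** (d_star - 1)) / (1.0 - c)


if __name__ == "__main__":
    print(geometric_intensities(3, 5, 0.5))
    print(geometric_partial_sum(3, 0.5))
```

For c<1 the shortest duplexes carry the largest weights, so the internal signal is dominated by partial duplex configurations; for c>1 the full duplex dominates.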

IV.3 Other Weight Functions

Two-State Model. A situation of interest to experiment occurs when two or more probes are provided to capture a given target at a set operating temperature, T; of the multiple partial and full duplex configurations, those with d≦d^ duplex repeats are in the high temperature regime (T_(M)(d≦d^)<T) while those with d>d^ duplex repeats are in the low temperature regime (T<T_(M)(d>d^)), d^ denoting the value of d indicating cross-over from low to high temperature. While the cross-over generally will reflect the shape of the duplex “melting” curve (Cantor & Smith, “Genomics”), a simple but instructive model results under the assumption of a melting curve with a step at d=d^ such that all (partial or full duplex) configurations with d≦d^ are assigned a weight w(d)=c_(Low) and all those with d>d^ are assigned a weight w(d)=c_(High). Explicit expressions are obtained for the trial function in Examples IV.1, treating d^ and the ratio c_(Low)/c_(High) as adjustable parameters.

Preferred Configurations: Modulations. More complex situations may be described by constructing appropriate weight functions, w=w(d). An interesting situation arises when the pitch of 10 base pairs/turn of B DNA of random sequence is incommensurate with the preferred alignment of probe and target repeats, each repeat containing, say, four bases, such that repeat boundaries in probe and target strands are juxtaposed (see FIG. 2). Incommensurate alignment for certain values of d may result in undertwisting or overtwisting of the helix formed by the probe-target complex, increasing the free energy, and hence reducing the probability of formation of the corresponding duplex states.

An example of this type would be the preference of configurations in which the product rd, r denoting the number of bases in the repeat, matches the pitch of the double helix, i.e., rd/10=n, n denoting an integer. For example, with r=4, duplex configurations with d=5, d=10, d=15, etc. would display enhanced intensities relative to other duplex configurations, leading to modulations in intensity of profiles. A simple trial function, based on Example IV.1, would be w(d)=c+(−1)^(d)δc, producing “odd”-“even” modulations; more general functional forms also are possible. Modulated versions of the weight functions in Examples IV.2 and IV.3 also may be considered.
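The commensurability condition rd/10=n and the odd-even trial function can be written out as a brief Python sketch. The default pitch of 10 base pairs per turn is taken from the text; the function names are hypothetical.

```python
def commensurate_repeats(r, d_max, pitch=10):
    """Values of d (1 .. d_max) for which r * d is an integer multiple of the pitch."""
    return [d for d in range(1, d_max + 1) if (r * d) % pitch == 0]


def modulated_weight(d, c, delta_c):
    """Trial function w(d) = c + (-1)**d * delta_c with odd-even modulation."""
    return c + (-1) ** d * delta_c


if __name__ == "__main__":
    print(commensurate_repeats(4, 15))
    print([round(modulated_weight(d, 1.0, 0.1), 3) for d in range(1, 7)])
```

With r=4 the first function returns d=5, 10 and 15, the configurations named in the text as displaying enhanced intensities.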

Loops. At high temperature, long probes, p>t, also can display loops (FIG. 9), permitting the probe to form two or more stretches of duplex with the target. While the probability of formation of a loop of d repeats, ˜d^(−3/2), is lower than the probability of formation of a tail of d repeats, ˜d^(−1/2), multiple such loops can form (and migrate along the target), a fact that reduces the probe's loss of configurational entropy. The method of the present invention is readily extended to include loop configurations by counting the number of such states (see e.g., Fleer et al, “Polymers at Interfaces”, Chapt 4).

It should be understood that the terms, examples and expressions above are exemplary only, and not limiting, that steps in method claims can be performed in any order, unless otherwise specified, and that the scope of the invention is defined only in the claims which follow, and includes all equivalents of the subject matter of the claims.

1. A method for determining the number of tandem repeat units in a target oligonucleotide, comprising: selecting at least two probes, under the following selection criteria: (i) each probe has repeated sequence elements (“probe repeats”), each probe repeat being complementary to the target oligonucleotide's repeat unit (“target repeat”), but each probe having fewer probe repeats than the total number of target repeats present in the target oligonucleotide; and (ii) each probe, upon hybridizing to the target oligonucleotide under conditions permitting the formation of a duplex of at least one repeat unit, is labeled with a first color in configurations in which the probe's 3′ end is aligned in juxtaposition to a target repeat nucleotide which is not the 5′ terminal target repeat nucleotide, and labeled with a second, different color in the configuration in which the probe's 3′ end is aligned in juxtaposition to the 5′ terminal target repeat nucleotide; hybridizing said at least two probes to one target oligonucleotide, and determining, for each of said at least two probes the intensities of the respective signals of the first and second colors; analyzing the intensities of the signals resulting from the hybridization step where said intensities reflect the probability of different duplex configurations for probes and targets having respectively particular probe repeats and target repeats; and determining the number of tandem repeats in the target oligonucleotide by regression analysis.
2. The method of claim 1 wherein the probability for each possible duplex configuration is weighted such that for increasing reaction temperature, the formation of said duplex is less likely.
3. A method for determining the number of tandem repeat units in a target oligonucleotide, comprising: selecting at least one probe, under the following selection criteria: (i) the probe has a number (p) of tandem repeat sequences complementary to the target tandem repeat sequences, but has fewer tandem repeats than the total number (t) of tandem repeats present in the target oligonucleotide; and (ii) the probe, upon hybridizing to the target oligonucleotide to form a duplex of p repeat units, is labeled with a first color in configurations having the probe's 3′ end aligned in juxtaposition to a target repeat nucleotide which is not the 5′ terminal nucleotide, and labeled with a second, different color in the configuration having the probe's 3′ end aligned in juxtaposition to the 5′ terminal target repeat nucleotide; hybridizing said probe to one target oligonucleotide under conditions ensuring the formation of a duplex of p (<t) repeats and determining for that probe the intensities of the respective signals of the first and second colors; and determining the length of the unknown target oligonucleotide using the formula: when the number of repeated sequence elements in the probe is (p), and the number of repeated sequence elements in the target is (t), then the ratio of the intensity of the signal of the first color to the intensity of the signal of the second color associated with the probe, is proportional to t−p.
4. The method of claim 3, comprising the additional step of providing internal calibration by permitting a second probe of length p′, differing in length from that of the first probe, p, to form a duplex of p′ repeat units and determining for that probe the intensities of the respective signals of the first and second colors.
5. The method of claim 3 wherein an assay temperature is selected which is higher than the melting temperature of partial duplex configurations so as to favor formation exclusively of full length duplex configurations.
6. The method of claim 3, comprising the additional step of providing calibration of said intensity ratios by permitting the probe to form a duplex of p repeat units with a reference target comprising a known number of repeated sequence elements and determining for that probe the intensities of the respective signals of the first and second colors.
7. The method of claim 6 wherein the additional step is performed following the hybridization step of said probe to one target oligonucleotide.
8. The method of claims 1 or 3 used to determine the number of tandem repeat units in a plurality of target oligonucleotides, wherein the 5′ ends of each of a group of probes with different numbers of tandem repeats are attached to an encoded particle, encoded such that the identity of the probe attached can be determined by decoding.
9. The method of claims 1 or 3 wherein the labeling step is performed by 3′ extension of the probe by a single labeled ddNTP, wherein a first ddNTP, labeled with the first color, is complementary to the target nucleotide X in the 3′ terminal position of the target repeat unit, and a second ddNTP, labeled with the second color, is complementary to the target nucleotide Y in the position immediately adjacent to the 5′ terminus of the 5′ terminal tandem repeat; provided that X and Y are not the same.
10. The method of claim 1 or 3 wherein the labeling step is performed by 3′ extension of the probe by a single labeled ddNTP, wherein a first ddNTP, labeled with the first color, is complementary to the target nucleotide A in the 3′ terminal position of the target repeat unit, and a second ddNTP, labeled with the second color, is complementary to a target nucleotide B in a position in the region adjacent to the 5′ end of the 5′ terminal tandem repeat but not in the position immediately adjacent to the 5′ end of the 5′ terminal tandem repeat; provided that A and B are not the same.
11. The method of claims 1 or 3 wherein the first and last nucleotides in the probe's tandem repeats do not align with the first and last nucleotides in the target's tandem repeats, and wherein, upon hybridization of the probe to the 5′ terminal repeat end of the target, the target nucleotide located immediately adjacent to the target nucleotide juxtaposed to the probe's 3′ terminus is different from the nucleotide in the 3′ terminal position of a target repeat unit.
12. The method of claims 1 or 3 used to simultaneously determine the number of tandem repeat units in a plurality of target oligonucleotides, wherein the 5′ end of each of a group of selected probes differing in the number of probe repeats is attached to an encoded particle, encoded such that the identity of the probe attached can be determined by decoding.
13. In a method of determining the number of tandem repeat units in a target, identifying probes having a greater number of tandem repeat units than the number of tandem repeats in the target, comprising: selecting a plurality of probes, under the following selection criteria: (i) each probe has a series of tandem repeat sequences complementary to the target tandem repeat sequences, and some probes (p′) have more tandem repeats than the total number of tandem repeats present in the target oligonucleotide; and (ii) upon hybridizing with a target oligonucleotide, where each probe's 3′ end aligns with a target tandem repeat nucleotide which is not the 5′ terminal tandem repeat target nucleotide, the probe is labeled with a first color, and where its 3′ end aligns with a target tandem repeat nucleotide which is the 5′ terminal target nucleotide, the probe is labeled with a second color; providing an adapter oligonucleotide segment having two portions, a first of which is complementary to the flanking region adjacent to the 5′ end of the target tandem repeat units and a second of which is complementary to a tandem repeat section of the probes, the second portion including at least one additional nucleotide which can align with a nucleotide added to the 3′ end of the p′ probes; provided that said additional nucleotide is not the same as the nucleotide at the 3′ end of a target repeat unit or the nucleotide at the 3′ end of the target flanking sequence; and provided that a probe p′ having its 3′ end aligned with any part of the adapter is labeled with a third color; hybridizing at least one said probe (which is not a p′ probe) to at least one target oligonucleotide of known length under conditions ensuring the formation of a duplex of p (<t) repeats, and determining the intensities of the respective signals from the first color and second colors; hybridizing at least one said probe to at least one target oligonucleotide having an unknown number of repeat units, or to both an oligonucleotide and to a portion of the adapter sequence aligned with the 3′ terminal repeat unit of the probe, said adapter being hybridized to said target oligonucleotide, under conditions ensuring the formation of a duplex of p (≦t) repeats and determining the presence of the first color, and if there is any, the intensities of the respective signals from the first and second colors, and if there is no first color, determine the presence of any of the third color; but where the first color is present; and determining the length of the unknown target oligonucleotide using the formula: when the number of repeat sequences in the probe is (p), and the number of repeat sequences in the target is (t), then the ratio of the intensity of the signal of the first color over the intensity of the signal of the second color is proportional to p−t.
14. The method of claim 13 used to simultaneously determine the number of tandem repeat units in a plurality of target oligonucleotides wherein the 5′ ends of each of a group of probes with different numbers of tandem repeats are attached to an encoded particle, encoded such that the identity of the probe attached can be determined by decoding.
15. The method of claim 13 wherein the labeling is done by adding labeled ddNTPs to the 3′ terminal end of the probe, wherein a first ddNTP nucleotide labeled with the first color is complementary to the nucleotide X, a second ddNTP nucleotide labeled with the second color different than the first color is complementary to a nucleotide Y, and a third ddNTP nucleotide labeled with the third color different is complementary to the nucleotide N.
16. The method of claim 13 wherein hybridizing steps are not done in a sequential manner.
17. A method for determining the number of tandem repeat units in a target oligonucleotide, comprising: selecting a plurality of probes, under the following selection criteria: (i) each probe has a series of tandem repeat sequences, p, complementary to the target tandem repeat sequences, t, and all probes have more tandem repeats than the total number of tandem repeats present in a target oligonucleotide; and (ii) upon hybridizing with a target oligonucleotide, where each probe's 5′ end aligns with a target tandem repeat nucleotide which is not the 3′ terminal tandem repeat target nucleotide, the target is labeled with a first color, and where a probe's 5′ end aligns with a target tandem repeat nucleotide which is the 3′ terminal target nucleotide, the target is labeled with a second different color; hybridizing at least one said probe to at least one target oligonucleotide of known length under conditions ensuring the formation of a duplex of p (>t) repeats; cleaving from the probe and target the un-annealed 3′ terminal portions; determining the intensities of the respective signals from the first and second colors; hybridizing at least one said probe to at least one target oligonucleotide of unknown length; cleaving from the probe and target the un-annealed 3′ terminal portions; determining the intensities of the respective signals from the first and second colors; and determining the length of the unknown target oligonucleotide using the formula: the ratio of the intensity of the signal of the first color over the intensity of the signal of the second color associated with the same probe as that associated with the first color, is proportional to p−t.
18. The method of claim 17 used to simultaneously determine the number of tandem repeat units in a plurality of target oligonucleotides, wherein the 5′ ends of each of a group of probes with different numbers of tandem repeats are attached to an encoded particle, encoded such that the identity of the probe attached can be determined by decoding.
19. The method of claim 17 wherein the labeling is done by adding labeled ddNTPs to the 3′ terminal end of the probe, wherein a first ddNTP nucleotide labeled with the first color is complementary to the nucleotide at the 3′ end of the probe terminal repeat unit, and a second ddNTP nucleotide labeled with the second color is complementary to a second nucleotide Y in the probe sequence immediately adjacent to the 5′ end of the 5′ terminal tandem repeat unit in the probe; provided that the first and second nucleotides are not the same.
20. The method of claim 17 wherein cleaving is done using exonuclease I.